Using MT-Based Metrics for RTE
Authors
Abstract
We analyse the complexity of the RTE task data and divide the T/H pairs into three classes, depending on the type of knowledge required to solve the problem. We then propose an approach suitable for the two easier classes, which account for two thirds of all pairs. Our assumption is that T and H are translations of the same source sentence, so we use an MT evaluation metric (Meteor) to judge the similarity of the two translations. Clearly, in most cases where T entails H, T and H do not have exactly the same meaning; however, we observe that the similarity is still much higher for positive T/H pairs than for negative ones. We achieve a macro-average F1-score of 46.34 for the task. On the one hand, this shows that our approach has its weaknesses, mainly because the assumption that T and H carry the same meaning does not always hold, especially when T and H have very different lengths. On the other hand, considering that RTE-7 is a difficult, class-imbalanced problem (<5% YES, >95% NO), this robust approach achieves a decent result for a large amount of data: it is above the median of this year's results and is comparable with the top results from the previous years.
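The core of the approach can be sketched in a few lines: treat T as a reference translation and H as a candidate translation, score the pair with an MT metric, and predict entailment when the score exceeds a threshold. The sketch below uses NLTK's Meteor implementation as a stand-in for the Meteor toolkit used by the authors; the threshold value and the example pair are invented for illustration and are not taken from the paper.

```python
# Minimal sketch: score H against T with an MT metric (NLTK's Meteor
# implementation as a stand-in) and threshold the score to decide
# entailment. The threshold 0.35 is an illustrative assumption.
import nltk
from nltk.translate.meteor_score import meteor_score

nltk.download("wordnet", quiet=True)  # Meteor's synonym matching needs WordNet


def entails(text: str, hypothesis: str, threshold: float = 0.35) -> bool:
    # Recent NLTK versions expect pre-tokenized input.
    score = meteor_score([text.split()], hypothesis.split())
    return score >= threshold


t = "Google bought YouTube for 1.65 billion dollars in 2006."
h = "Google acquired YouTube."
print(entails(t, h))
```

Because the task is heavily skewed towards NO, the choice of threshold effectively trades recall on the rare YES class against precision on the dominant NO class.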
Similar articles
The Correlation of Machine Translation Evaluation Metrics with Human Judgement on Persian Language
Machine Translation Evaluation Metrics (MTEMs) are at the core of Machine Translation (MT) engine development, as engines are refined through frequent evaluation. Although MTEMs are widespread today, their validity and quality for many languages are still in question. The aim of this research study was to examine the validity and assess the quality of MTEMs from the Lexical Similarity set on machine tra...
Regression for Sentence-Level MT Evaluation with Pseudo References
Many automatic evaluation metrics for machine translation (MT) rely on making comparisons to human translations, a resource that may not always be available. We present a method for developing sentence-level MT evaluation metrics that do not directly rely on human reference translations. Our metrics are developed using regression learning and are based on a set of weaker indicators of fluency a...
Sensitivity of Automated MT Evaluation Metrics on Higher Quality MT Output: BLEU vs Task-Based Evaluation Methods
We report the results of an experiment to assess the ability of automated MT evaluation metrics to remain sensitive to variations in MT quality as the average quality of the compared systems goes up. We compare two groups of metrics: those which measure the proximity of MT output to some reference translation, and those which evaluate the performance of some automated process on degraded MT out...
Combining Confidence Estimation and Reference-based Metrics for Segment-level MT Evaluation
We describe an effort to improve standard reference-based metrics for Machine Translation (MT) evaluation by enriching them with Confidence Estimation (CE) features and using a learning mechanism trained on human annotations. Reference-based MT evaluation metrics compare the system output against reference translations looking for overlaps at different levels (lexical, syntactic, and semantic)....
Choosing the Best MT Programs for CLIR Purposes - Can MT Metrics Be Helpful?
This paper describes the use of MT metrics for choosing the best candidates for MT-based query translation resources. Our main metric is METEOR, but we also use NIST and BLEU. The language pair of our evaluation is English-German, because MT metrics still do not offer very many language pairs for comparison. We evaluated translations of CLEF 2003 topics by four different MT programs with MT metrics a...
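As a rough illustration of this kind of metric-based comparison, the sketch below scores two hypothetical MT outputs against a single reference translation with METEOR, NIST, and BLEU using NLTK's implementations; the system names, sentences, and the choice of NLTK as the scoring library are assumptions made for the example, not details from the paper.

```python
# Hypothetical sketch: rank MT systems by scoring their output for the
# same source sentence against a reference with METEOR, NIST and BLEU.
import nltk
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
from nltk.translate.nist_score import sentence_nist
from nltk.translate.meteor_score import meteor_score

nltk.download("wordnet", quiet=True)  # METEOR's synonym matching needs WordNet

# Invented reference and system outputs, pre-tokenized with a simple split.
reference = "the committee discussed the new trade agreement".split()
candidates = {
    "system_A": "the committee talked about the new trade deal".split(),
    "system_B": "committee new trade agreement discussed today also".split(),
}

smooth = SmoothingFunction().method1  # avoid zero BLEU on short sentences
for name, hyp in candidates.items():
    scores = {
        "METEOR": meteor_score([reference], hyp),
        "NIST": sentence_nist([reference], hyp),
        "BLEU": sentence_bleu([reference], hyp, smoothing_function=smooth),
    }
    print(name, {metric: round(value, 3) for metric, value in scores.items()})
```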